
Collaborating Authors

Florent Krzakala


Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions

Neural Information Processing Systems

We exemplify our result in two tasks of interest in statistical learning: (a) classification for a mixture with sparse means, where we study the efficiency of the ℓ1 penalty with respect to ℓ2; (b) max-margin multiclass classification, where we characterise the phase transition on the existence of the multi-class logistic maximum likelihood estimator for K > 2.
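The ℓ1-versus-ℓ2 comparison in setting (a) can be sketched in plain NumPy with proximal gradient descent on the regularised logistic loss. The dimensions, sparsity level, regularisation strength, and step size below are illustrative choices, not the paper's; the point is that the ℓ1 prox step produces exact zeros on the noise coordinates when the means are sparse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-class Gaussian mixture whose mean vector mu has only s non-zero
# coordinates (illustrative sizes, not the paper's asymptotic regime).
d, n, s = 200, 150, 5
mu = np.zeros(d)
mu[:s] = 2.0
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu[None, :] + rng.standard_normal((n, d))

def fit_logistic(X, y, lam, penalty, steps=500, lr=0.1):
    """Proximal gradient descent on regularised logistic loss."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n
        if penalty == "l2":
            w -= lr * (grad + lam * w)
        else:  # l1: gradient step followed by soft-thresholding (ISTA)
            w -= lr * grad
            w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

w_l1 = fit_logistic(X, y, lam=0.1, penalty="l1")
w_l2 = fit_logistic(X, y, lam=0.1, penalty="l2")

# The l1 estimate is exactly sparse; the l2 estimate shrinks all
# coordinates but generically keeps every one of them non-zero.
n_zeros_l1 = int((w_l1 == 0).sum())
n_zeros_l2 = int((w_l2 == 0).sum())
```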





The committee machine: Computational to statistical gaps in learning a two-layers neural network

Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová

Neural Information Processing Systems

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows optimal learning to be performed in polynomial time for a large set of parameters.
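The teacher-student scenario for a committee machine can be instantiated as a small toy model: a teacher with K hidden sign units votes on the label, and a student is scored by its error on the teacher's labels. The activation, sizes, and odd K (to avoid tied votes) are illustrative assumptions, and this sketch shows only the data model, not the AMP algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher committee machine: y = sign( sum_k sign(w_k . x / sqrt(d)) ).
d, K, n = 100, 3, 1000  # K odd so the committee vote never ties
W_teacher = rng.standard_normal((K, d))

def committee(W, X):
    """Label inputs X with a committee machine with weight rows W."""
    return np.sign(np.sign(X @ W.T / np.sqrt(X.shape[1])).sum(axis=1))

# Teacher-student data: the teacher labels i.i.d. Gaussian inputs.
X = rng.standard_normal((n, d))
y = committee(W_teacher, X)

# A student sharing the teacher's weights makes no errors, while a
# randomly initialised student is at chance level.
W_random = rng.standard_normal((K, d))
err_teacher = np.mean(committee(W_teacher, X) != y)
err_random = np.mean(committee(W_random, X) != y)
```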


Deep Learning of Compositional Targets with Hierarchical Spectral Methods

Hugo Tabanelli, Yatin Dandi, Luca Pesce, Florent Krzakala

arXiv.org Machine Learning

Why depth yields a genuine computational advantage over shallow methods remains a central open question in learning theory. We study this question in a controlled high-dimensional Gaussian setting, focusing on compositional target functions. We analyze their learnability using an explicit three-layer fitting model trained via layer-wise spectral estimators. Although the target is globally a high-degree polynomial, its compositional structure allows learning to proceed in stages: an intermediate representation reveals structure that is inaccessible at the input level. This reduces learning to simpler spectral estimation problems, well studied in the context of multi-index models, whereas any shallow estimator must resolve all components simultaneously. Our analysis relies on Gaussian universality, leading to sharp separations in sample complexity between two and three-layer learning strategies.
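The kind of spectral estimation problem referenced here can be illustrated on the simplest single-index case: for a target y = (w·x)² with Gaussian inputs, the matrix M = (1/n) Σ yᵢ xᵢxᵢᵀ concentrates on I + 2wwᵀ, so its top eigenvector estimates the hidden direction. This is a hypothetical toy instance with illustrative sizes, not the paper's three-layer layer-wise construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden unit direction w and single-index labels y = (w . x)^2.
d, n = 30, 20000
w = rng.standard_normal(d)
w /= np.linalg.norm(w)

X = rng.standard_normal((n, d))
y = (X @ w) ** 2

# Empirical spectral matrix M = (1/n) sum_i y_i x_i x_i^T, whose
# population version is I + 2 w w^T by Gaussian (Wick) calculus.
M = (X * y[:, None]).T @ X / n

# The top eigenvector of M aligns with the hidden direction w.
eigvals, eigvecs = np.linalg.eigh(M)
w_hat = eigvecs[:, -1]  # eigenvector of the largest eigenvalue

overlap = abs(w_hat @ w)
```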



Subspace Phase

Neural Information Processing Systems

For a given sample (x_i), i ∈ [n], define X ∈ R^{d×n} with columns x_i, and matrices U ∈ R^{n×k} and V ∈ R^{d×k} with rows u_i and v_i.



high

Neural Information Processing Systems

We show that it depends on the precise way in which the limit is taken, and in particular on how the quantity of data, the hidden-layer width, and the learning rate scale with d.